Abstract. The concept of neutral evolutionary networks being a significant
factor in evolutionary dynamics was first proposed by Huynen et al.
about 7 years ago. In one sense, the principle is easy to state: because
most mutations to an organism are deleterious, one would expect that
neutral mutations that don't affect the phenotype will have disproportionately
greater representation amongst successor organisms than one
would expect if each mutation were equally likely.

So it was with great surprise that I noted neutral mutations being very
rare in a visualisation of phylogenetic trees generated in Tierra, since I
already knew that there was a significant amount of neutrality in the
Tierra genotype-phenotype map. This paper reports on an investigation
into this mystery.



1 Introduction 

The influence of neutral networks in evolutionary processes was first elucidated
by Peter Schuster's group in Vienna in 1996. Put simply, two genotypes are
considered neutrally equivalent if they map to the same phenotype. A neutral
network is a set of genotypes connected by this neutrality relationship on links
with Hamming distance 1 (i.e. each link of the network corresponds to a mutation
at a single site of the genome).
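As an illustration of this definition only, the following minimal Python
sketch collects the neutral network containing a given genotype by following
single-site mutations that leave the phenotype unchanged. The function names
and the toy phenotype maps are assumptions made for the example; they are not
the RNA folding map or the Tierra mapping discussed in this paper.

```python
def neighbours(genotype, alphabet):
    """Yield every genotype at Hamming distance 1 from `genotype`."""
    for site, current in enumerate(genotype):
        for symbol in alphabet:
            if symbol != current:
                yield genotype[:site] + symbol + genotype[site + 1:]


def neutral_network(start, phenotype, alphabet):
    """Genotypes reachable from `start` via phenotype-preserving point mutations.

    `phenotype` is a hypothetical genotype-to-phenotype function.
    """
    target = phenotype(start)
    seen = {start}
    frontier = [start]
    while frontier:
        g = frontier.pop()
        for h in neighbours(g, alphabet):
            if h not in seen and phenotype(h) == target:
                seen.add(h)
                frontier.append(h)
    return seen


# Toy map 1: the phenotype is the first symbol, so the network is large.
net = neutral_network("000", lambda g: g[0], "01")

# Toy map 2: the phenotype is the number of 1s. "001" and "010" share a
# phenotype but are two mutations apart, so they lie on different networks.
isolated = neutral_network("001", lambda g: g.count("1"), "01")
```

The second toy map shows why the Hamming distance 1 condition matters: sharing
a phenotype is not enough to place two genotypes on the same neutral network.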

These researchers noted that evolution tended to proceed by diffusion along
these neutral networks, punctuated occasionally by rapid changes to phenotypes
as an adaptive feature is discovered. The similarity of these dynamics with the
theory of Punctuated Equilibria was noted by Barnett [2]. It was also noted
that if a giant network existed that came within a hop or two of every possible
genotype, evolution would be particularly efficient at discovering solutions.

Most work on neutrality in evolution uses the genotype-phenotype mapping
defined by folding of RNA. This mapping is implemented in the open source
Vienna RNA package^, so is a convenient and well known testbed for ideas of
neutrality in evolution.

^ http://www.tbi.univie.ac.at/~ivo/RNA

Also in 1996, I developed a definition of the genotype-phenotype mapping
for Tierra, which was first published in 1997. I noticed the strong presence of
neutrality in this mapping at that time, which was later exploited to develop
a measure of complexity of the Tierran organism. In 2002, I started a
programme to visualise Tierra's phylogenetic trees and neutral networks [10] in
order to "discover the unexpected". Two key findings came out of this. The first
was that Tierra's genebanker data did not provide clean phylogenetic trees, but
had loops, and consisted of many discontinuous pieces. This later turned out to
be due to Tierra's habit of reusing genotype labels if those genotypes were not
saved in the genebanker database, which might happen if the population count
of that genotype failed to cross a threshold. This is all very well, except that a
reference to that genotype still exists in the parent field of successor genotypes.
The second big surprise was the paucity of neutral mutations in the phylogenetic
tree. We expect most mutations to an organism to be deleterious, and so expect
that neutral mutations will have disproportionately greater representation amongst
successor genotypes than one would expect if each mutation were equally likely.

2 Neutrality in Tierra 

Tierra is a well known artificial life system in which small self-replicating
computer programs are executed in a specially constructed simulator. These
computer programs (called digital organisms, or sometimes "critters") undergo
mutation, and radically novel behaviour is discovered, such as parasitism and
hyperparasitism.

It is clear what the genotype is in Tierra: it is just the listing of the program
code of the organism. The phenotype is a more diffuse thing, however. It is the
resultant effect of running the computer program, in all possible environments.
Christoph Adami defined this notion of phenotype for a similar artificial life
system called Avida. In Avida, things are particularly simple, in that organisms
either reproduce themselves at a fixed replication rate, or don't as the case may
be, and optionally perform a range of arithmetic operations on special registers
(defined by the experimenter).

In Tierra, organisms do interact with each other via a template matching
mechanism. For example, with a branching instruction like jmpo, if there is a
sequence of nop0 and nop1 instructions (which are no-operations) following the
branch, this sequence of 1s and 0s is used as a template for determining where to
branch to. In this case the CPU will search outwards through memory for a
complementary sequence of nop0s and nop1s. If the nearest complementary sequence
happens to lie in the code of a different organism, the organisms interact.
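The outward search for a complementary template can be sketched as follows.
This is a simplified illustration in Python, not Tierra's actual
implementation: the encoding of the soup as a list (0 for nop0, 1 for nop1,
None for any other instruction) and both function names are assumptions, and
details of the real Tierra CPU are deliberately not modelled.

```python
def complement(template):
    """Return the complementary template: each 0 becomes 1 and vice versa."""
    return [1 - bit for bit in template]


def find_nearest_complement(soup, start, template):
    """Search outwards from `start` for the nearest complementary template.

    `soup` uses a hypothetical encoding: 0 for nop0, 1 for nop1, None for
    every other instruction. Returns the index where the match begins, or
    None if the soup contains no match.
    """
    target = complement(template)
    n = len(target)
    for offset in range(1, len(soup)):
        # Try the same distance forwards first, then backwards.
        for pos in (start + offset, start - offset):
            if 0 <= pos <= len(soup) - n and soup[pos:pos + n] == target:
                return pos
    return None


# A branch at index 4 is followed by the template nop0 nop1 (indices 5-6).
# The complement nop1 nop0 sits at indices 1-2.
soup = [None, 1, 0, None, None, 0, 1, None, None, None]
match = find_nearest_complement(soup, 4, [0, 1])
```

If index 1 belonged to a different organism than the branch at index 4, this
match would be an interaction between the two organisms in the sense described
above.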

To precisely determine the phenotype of a Tierran organism, one would need
to execute the soup containing the organism and all possible combinations of
other genotypes. Whilst this is a finite task, it is clearly astronomically difficult.
One means of approximation is to consider just the interaction of pairs of genotypes
(called a tournament). Most Tierran organisms interact pairwise; very few
triple or higher order interactions exist. Similarly, rather than running
tournaments with all possible genotypes, we can approximate matters by using the
genotypes stored in a genebanker database after a Tierra run. In practice, it
turns out that various measures, such as the number of neutral neighbours, or
the total complexity of an organism, are fairly robust with respect to the exact
set of organisms used for the tournaments.

So the procedure is to pit all organisms in the genebanker pairwise against
each other, and record the outcome in a table (there is a small number of possible
outcomes). A row of this table is a phenotypic signature
for the genotype labelling that row. We can then eliminate those genotypes with
identical signatures in favour of one canonical genotype. This list of unique
phenotypes can be used to define pragmatically a test for neutrality of two
different organisms. Pit each organism against the list of unique phenotypes, and
if the signatures match, we have neutrality. The source code for this experiment
is available from the author's website.^
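The signature procedure can be sketched in a few lines of Python. This is an
illustration under stated assumptions rather than the code used for the
experiment: `outcome(a, b)` is a hypothetical stand-in for running a pairwise
tournament and classifying its result, and the toy outcome used at the end is
invented purely to make the example runnable.

```python
def unique_phenotypes(genotypes, outcome):
    """Keep one canonical genotype for each distinct tournament signature."""
    canonical = {}
    for g in genotypes:
        # One row of the tournament table: g against every stored genotype.
        signature = tuple(outcome(g, other) for other in genotypes)
        canonical.setdefault(signature, g)  # the first genotype seen is kept
    return list(canonical.values())


def is_neutral(a, b, test_set, outcome):
    """Treat two genotypes as neutral if their signatures on test_set match."""
    return all(outcome(a, t) == outcome(b, t) for t in test_set)


def toy_outcome(a, b):
    """Invented outcome for the example: compare genome lengths."""
    if len(a) > len(b):
        return "win"
    return "draw" if len(a) == len(b) else "lose"


test_set = unique_phenotypes(["aa", "ab", "abc"], toy_outcome)
```

With this toy outcome, "aa" and "ab" have identical rows, so only one of them
survives into the test set.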

Tierra has three different modes of mutation: 

Cosmic Ray: A site of the soup is randomly chosen and mutated;
Copy: Data is mutated during the copy operation;
Flaw: Instructions occasionally produce erroneous results.

Furthermore, in the case of cosmic ray and copy mutations, a certain proportion 
of mutations involve bitflips, rather than opcodes being substituted uniformly. 
This proportion is set as a parameter in the soup_in file (MutBitProp) — in 
these experiments, this parameter is set to zero. 

In order to study the issue of whether neutrality is greater or less than
expected in Tierra, I generated three datasets, each with one of the 3 modes of
mutation operating in isolation. The sizes of the data sets were 69,139, 87,003
and 198,982 genotypes respectively, generated over a time period of about 1000
million executed instructions. Genebanker's threshold was set to zero, so all
genotypes were captured. This led to a proper phylogenetic tree. After performing
a neutrality analysis, sets of 83, 86 and 158 unique phenotypes respectively were
extracted as the test sets for the tournaments.

Since the neighbourhood size increases exponentially with neighbourhood
diameter, I restrict analysis to single site, or point, mutations. In each data set,
around 7% of the genotypes were created by a mutation at a single
site and were neutrally equivalent to their parent. For each of these, I compute the
number of neutral neighbours $n_i$ existing in the 1 hop neighbourhood of the
parent genotype $i$, which is of size $32\ell_i$, where $\ell_i$ is the length of
the genome. For a given parent $i$, the ratio

$$r_i = \frac{32\,\ell_i\,\nu_i}{o_i\,n_i} \qquad (1)$$

gives the proportion of neutral links actually followed relative to the number of
neutral links available (neutrality excess), where $\nu_i$ is the number of neutrally
equivalent offspring, $o_i$ the total number of offspring and $n_i$ the size of the 1
hop neutral neighbourhood. Fig. 1 shows the running average of this quantity
over these transitions, with the genotypes numbered in size order.
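As a worked illustration, the ratio can be computed as below. The sketch reads
the neutrality excess as the fraction of offspring that are neutral divided by
the fraction of 1 hop neighbours that are neutral, taking the neighbourhood
size to be 32 times the genome length as stated above. The function and
parameter names are assumptions for the example, and the numbers are made up.

```python
def neutrality_excess(nu, o, n, length, alphabet=32):
    """Neutral fraction of offspring relative to neutral fraction of neighbours.

    nu: neutrally equivalent offspring, o: total offspring,
    n: neutral neighbours in the 1 hop neighbourhood, length: genome length.
    """
    followed = nu / o                    # neutral links actually followed
    available = n / (alphabet * length)  # neutral links available
    return followed / available


def running_average(values):
    """Average of all values up to and including each position."""
    total = 0.0
    averages = []
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


# Made-up parent: 10 offspring of which 2 are neutral, genome length 10,
# and 64 of its 320 one-hop neighbours are neutral.
r = neutrality_excess(nu=2, o=10, n=64, length=10)
```

In this made-up case the offspring are neutral exactly as often as a uniformly
chosen neighbour would be, so the excess comes out at 1, the value expected
when no selection operates.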

^ http://parallel.hpc.unsw.edu.au/getaegisdist.cgi/getsource/eco-tierra.3, version 3.D3



In this analysis, no selection is operating, so one would expect that the
neutrality excess should be identical to 1. However, in the case of instruction
flaws, it is rather unpredictable what the effect is. In the case of cosmic ray
mutations, 50% of the time one would expect the parent to be mutated, rather than
the daughter. In the case of a mutation affecting a crucial gene of a parent
genotype, the organism may not be able to reproduce at all, thus favouring neutral
mutations. Only copy mutations should affect all sites of the genome equally,
leading to a neutrality excess equal to one. The measured value, however, is about
1.3, substantially greater than one. The reason for this is not known at this point
in time.

The datasets were further subsetted to include just those transitions
whose daughter organism successfully reproduced, i.e. with a maximum population
count greater than 1. The neutrality excess in this case is substantially less
than 1, so something in Tierran evolution is favouring non-neutral evolution.



[Figure 1 plot: running average of neutrality excess (vertical axis, 0.2 to 1.6)
against genome number (horizontal axis, 2000 to 16000). Legend: copy, no
selection; copy, with selection; cosmic, no selection; cosmic, with selection;
flaw, no selection; flaw, with selection.]



Fig. 1. Running average of neutrality excess ($\langle r_i \rangle$). Genomes are
ordered according to size, and neutrality excess is averaged over all genomes to
the left of that data point. Three different datasets are analysed, with each of
the three modes of mutation turned on. Then the datasets are further filtered to
only include offspring whose maximum population count is greater than 1, i.e.
selection is operating.



3 Vienna RNA folding experiments 

It is quite well known that evolution using the RNA folding map exhibits a
great deal of neutrality, at least for a standard genetic algorithm optimising a
well defined fitness function. Tierra is a coevolutionary system, and does not
have a well defined fitness function. Rather, the chance of an organism surviving
at each time step depends on what other organisms are in the environment at
the time, and has a significant component of contingency.

One possible cause for the repression of neutrality in Tierra is this
coevolutionary nature. The reasoning follows from the idea of the Red Queen effect.
This says that organisms must continuously evolve just to remain adaptive.^ In
such a circumstance, neutral evolution is maladaptive, and will surely be suppressed.

A convincing argument in favour of this hypothesis would be the demonstra- 
tion of a coevolutionary system based on the RNA folding map that exhibited 
this repression of neutrality. Whilst I haven't achieved this goal, I will report on 
a couple of attempts. 

The first attempt is an instantiation of the simplest possible coevolutionary
system. It consists of two populations: a tracker population $T$ which attempts
to be as similar to the other population as possible, and an evader population
$E$ that attempts to be as different from the tracker population as possible. The
Vienna RNA folding library is used, and fitness functions $f_T(x,E)$, $f_E(x,T)$
are defined for the trackers and evaders respectively, based on the average
distance between their phenotypes:

$$f_T(x,E) = -\frac{1}{N_E}\sum_{y\in E} d(x,y), \quad \forall x\in T$$

$$f_E(x,T) = \frac{1}{N_T}\sum_{y\in T} d(x,y), \quad \forall x\in E \qquad (2)$$

where $N_T$ and $N_E$ are the population counts of trackers and evaders respectively,
and $d(x,y)$ is the string edit distance^ between the folded structures of $x$ and $y$.
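A minimal Python sketch of these two fitness functions follows, reading the
tracker fitness as the negated mean distance to the evaders and the evader
fitness as the mean distance to the trackers. The `fold` and `distance`
arguments are hypothetical stand-ins for the Vienna RNA package's folding and
string edit distance routines, which are not called here; the toy versions at
the end exist only to make the example runnable.

```python
def mean_distance(x, population, fold, distance):
    """Average phenotypic distance from x to every member of population."""
    return sum(distance(fold(x), fold(y)) for y in population) / len(population)


def tracker_fitness(x, evaders, fold, distance):
    """Trackers are fitter the closer they are, on average, to the evaders."""
    return -mean_distance(x, evaders, fold, distance)


def evader_fitness(x, trackers, fold, distance):
    """Evaders are fitter the further they are, on average, from the trackers."""
    return mean_distance(x, trackers, fold, distance)


def toy_fold(sequence):
    """Placeholder for a folding routine: the 'structure' is the sequence."""
    return sequence


def hamming(a, b):
    """Placeholder for string edit distance on equal-length strings."""
    return sum(p != q for p, q in zip(a, b))


f_t = tracker_fitness("AAAA", ["AAAA", "AAUU"], toy_fold, hamming)
f_e = evader_fitness("AAAA", ["AAUU"], toy_fold, hamming)
```

Passing the folding and distance routines as arguments keeps the sketch
independent of any particular RNA library.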

I implemented a simple genetic algorithm with just a point mutation operator.
All RNA strings have the same length. During reproduction, each RNA
string copies itself, possibly with a mutation. During the selection step, the least
fit 50% of organisms are culled, bringing the population count back to the starting
value. For the results presented in Fig. 2, the GA parameters are shown in
Table 1. Source code for this experiment is available from the author's website.^



String length:                  20
Population size:                100
Mutation probability per site:  0.1

Table 1. Genetic Algorithm parameters for the RNA folding experiment reported
in the paper
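One generation of a genetic algorithm of this kind (every string copies itself
with possible point mutations, then the least fit half is culled) might look
like the following Python sketch. It is an illustration under assumptions, not
the code used for the experiment: the four-letter alphabet, the choice to
always mutate a site to a different base, and the `fitness` callable are all
assumed for the example.

```python
import random

ALPHABET = "ACGU"  # assumed RNA alphabet for this sketch


def mutate(genome, rate, rng):
    """Copy a genome, changing each site to a different base with probability rate."""
    return "".join(
        rng.choice([b for b in ALPHABET if b != base]) if rng.random() < rate else base
        for base in genome
    )


def generation(population, fitness, rate, rng):
    """One reproduce-then-cull step that leaves the population size unchanged."""
    offspring = [mutate(g, rate, rng) for g in population]
    pool = population + offspring           # population count doubles
    pool.sort(key=fitness, reverse=True)    # fittest first
    return pool[: len(population)]          # cull the least fit 50%


# Parameters taken from Table 1; the fixed-target fitness is a toy stand-in.
rng = random.Random(0)
population = ["".join(rng.choice(ALPHABET) for _ in range(20)) for _ in range(100)]
target = "A" * 20


def toy_fitness(genome):
    """Number of sites that agree with the fixed target string."""
    return sum(a == b for a, b in zip(genome, target))


next_population = generation(population, toy_fitness, 0.1, rng)
```

For the coevolving case, the `fitness` callable would be rebuilt each
generation from the current state of the opposing population.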



^ Like the Red Queen in Lewis Carroll's Through the Looking Glass, who had to keep
running, just to stay where she was.

^ Please consult the Vienna RNA package documentation for a precise definition of
string edit distance.

^ http://parallel.hpc.unsw.edu.au/getaegisdist.cgi/getsource/rnafold/, version Dl 




Fig. 2. Neutrality excess for the RNA folding experiment, as a function of time. 
Separate lines are plotted for trackers, evaders and trackers tracking a fixed 
target. 



Fig. 2 shows the neutrality excess for trackers and evaders, defined in the
same way as eq. (1). A third baseline run of a single population tracking a fixed
target is also shown (corresponding to a classic fixed fitness function genetic
algorithm). The most obvious thing about these results is that neutrality is
highly adaptive for evaders, who clearly are trying to make themselves occupy as
small a footprint in phenotype space as possible. Given the evaders' propensity
to stay still, trackers will tend to behave like their fixed target counterparts.

This indicated that trackers were dominating over the evaders. Another
experiment I performed was one where all coevolving populations were symmetric. In
this case, each population would track a second and evade a third population.
The simplest arrangement of these has 3 coevolving populations in a "Rock,
Scissors, Paper" configuration, so no one population dominates. I also tried other
combinations of up to 10 separately coevolving populations, wired up randomly to
each other (but fixed at the start of the experiment). In all cases, the results
were much the same: the neutrality excess was less than for the trackers in
Fig. 2, but still just slightly greater than 1.

So how can we introduce the Red Queen effect to this system? One possibility
I haven't explored as yet is to somehow give evaders room to move, perhaps
by including a genome lengthening operator as part of the GA.

4 Conclusion 

The suppression of neutrality in Tierran evolution is a real effect. It is quite likely
that this is a Red Queen effect, with organisms needing to change to remain
adaptive. Experiments with using the RNA folding map to try to reproduce this
effect have proven inconclusive. However, it was noted that there is significant
evolutionary pressure to increase neutrality in evading populations.